MIL: Automatic Metaphor Identification by Statistical Learning

Authors

  • Yosef Ben Shlomo
  • Mark Last
Abstract

Metaphor identification in text is an open problem in natural language processing. In this paper, we present a new supervised learning approach, called MIL (Metaphor Identification by Learning), for identifying three major types of metaphoric expressions without using any knowledge resources or handcrafted rules. We derive a set of statistical features from a corpus representing a given domain (e.g., news articles published by Reuters). We also use an annotated set of sentences, which contain candidate expressions labelled as 'metaphoric' or 'literal' by native English speakers. We then induce a metaphor identification model for each expression type by applying a classification algorithm to the set of annotated expressions. The proposed approach is evaluated on a set of annotated sentences extracted from a corpus of Reuters articles. We show a significant improvement over a state-of-the-art learning-based algorithm and results comparable to those of a recently presented rule-based approach.
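
The pipeline outlined in the abstract (statistical features derived from a domain corpus, then a classifier induced from annotated expressions) can be sketched roughly as below. The abstract does not name the features or the classification algorithm, so everything in this sketch is an assumption made for illustration: the toy corpus, the verb-noun candidate pairs and their labels, the smoothed PMI feature, and the single-threshold classifier are hypothetical stand-ins, not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations

def corpus_stats(sentences):
    """Count, per sentence, each word and each unordered word pair."""
    word_counts, pair_counts = Counter(), Counter()
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(words, 2))
    return word_counts, pair_counts, len(sentences)

def smoothed_pmi(w1, w2, stats, k=0.5):
    """Pointwise mutual information with add-k smoothing (k is an assumption)."""
    word_counts, pair_counts, n = stats
    joint = pair_counts[frozenset((w1, w2))] + k
    return math.log2(joint * n / ((word_counts[w1] + k) * (word_counts[w2] + k)))

def classify(feature, threshold):
    """Assumed rule: weak corpus association suggests a metaphoric use."""
    return "metaphoric" if feature < threshold else "literal"

def train_threshold(features, labels):
    """Pick the cut-off with the highest accuracy on the annotated examples."""
    values = sorted(set(features))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    def correct(t):
        return sum(classify(f, t) == y for f, y in zip(features, labels))
    return max(candidates, key=correct)

# Hypothetical domain corpus and annotations (not taken from the paper).
corpus = [
    "the company cut costs", "the firm cut costs",
    "the bank raised rates", "the bank raised rates again",
    "markets digest data", "traders digest data slowly",
    "the news moved markets", "the news lifted shares",
    "the firm swallowed losses", "a rival cut prices",
]
annotated = [
    (("cut", "costs"), "literal"),
    (("raised", "rates"), "literal"),
    (("digest", "news"), "metaphoric"),
    (("swallowed", "rival"), "metaphoric"),
]

stats = corpus_stats(corpus)
features = [smoothed_pmi(verb, noun, stats) for (verb, noun), _ in annotated]
labels = [label for _, label in annotated]
threshold = train_threshold(features, labels)

# Classify a candidate expression that was not in the annotated set.
print(classify(smoothed_pmi("cut", "prices", stats), threshold))
```

Since the abstract describes one model per expression type, a fuller version of this sketch would presumably repeat the training step separately for each type, with its own annotated examples and a richer feature set.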

Related resources

Joint Clustering and Classification for Multiple Instance Learning

The Multiple Instance Learning (MIL) framework has been extensively used to solve weakly labeled visual classification problems, where each image or video is treated as a bag of instances. Instance Space based MIL algorithms construct a classifier by modifying standard classifiers by defining the probability that a bag is of the target class as the maximum over the probabilities that its instan...

Metaphor Identification Using Verb and Noun Clustering

We present a novel approach to automatic metaphor identification in unrestricted text. Starting from a small seed set of manually annotated metaphorical expressions, the system is capable of harvesting a large number of metaphors of similar syntactic structure from a corpus. Our method is distinguished from previous work in that it does not employ any hand-crafted knowledge, other than the init...

Incorporating multiple SVMs for automatic image annotation

In this paper, a novel automatic image annotation system is proposed, which integrates two sets of support vector machines (SVMs), namely the multiple instance learning (MIL)-based and global-feature-based SVMs, for annotation. The MIL-based bag features are obtained by applying MIL on the image blocks, where the enhanced diversity density (DD) algorithm and a faster searching algorithm are app...

Multiple Instance Learning from Weakly Labeled Videos

Automatic video tagging systems are targeted at assigning semantic concepts (“tags”) to videos by linking textual descriptions with the audio-visual video content. To train such systems, we investigate online video from portals such as YouTube as a large-scale, freely available knowledge source. Tags provided by video owners serve as weak annotations indicating that a target concept appears in a...

The Study of Automatic and Controlled Data Processing Speed Based on the Stroop Test in Students with Math Learning Disability

Introduction: The study of individual differences in information processing in order to predict the academic achievement of students with math disability is of great importance. The purpose of this study was to study automatic and controlled data processing speed based on the Stroop test in students with math learning disability. Materials and Methods: This descriptive study was causal-comparat...

Publication date: 2015